Abstract. We study the emergence of mutual cooperation in evolutionary prisoner's
dilemma games when the players are located on a square lattice. The players can
choose one of three strategies: cooperation (C), defection (D), or "tit for tat"
(T), and their total payoffs come from games with their nearest neighbors. During
the random sequential updates a player adopts the strategy of a randomly chosen
neighbor if that neighbor has a higher payoff. We compare the effects of two types
of external constraints added to this Darwinian evolutionary process. In both cases
the strategy of a randomly chosen player is replaced with probability P by another
strategy. In the first case, the new strategy is chosen at random from the two
others, while in the second case the new strategy is always C. Using generalized
mean-field approximations and Monte Carlo simulations, we evaluate the strategy
concentrations in the stationary state for different strengths of the external
constraints, characterized by the probability P.



INTRODUCTION 

The successful applications of game theory in economics and political decision
making stimulated its rapid development after the Second World War [1].
Originally, game theory was devoted to finding the optimal strategy for a given
game between two intelligent players. Straightforward developments include
the generalization to iterated games of n players and the assumption of local
interactions among spatially distributed players. Spatial evolutionary prisoner's
dilemma games (SEPDG) have attracted particular attention because of their
applicability in the human and behavioral sciences as well as in biology [2-6].
Nowadays the prisoner's dilemma game is considered to be the metaphor for studying
the emergence of cooperation among selfish individuals. The emergence of cooperation
appears to be crucial at many transitions in evolution [7]. The first numerical
investigations have shown that cooperation can be maintained by very simple
strategies in iterated games [5]. Very recently it has been demonstrated that the
players can be as simple as bacteriophages (viruses that infect bacteria) [8,9].

In these systems the players wish to maximize their individual income coming
from games with other players. The prisoner's dilemma game is a simple version
of the two-player matrix games in which the players' incomes depend on their
simultaneous choice between two options. Following the widely accepted terminology,
each player can choose defection or cooperation with the other player. The defector
reaches the highest payoff t (the temptation to defect) against a cooperator,
who then receives the lowest payoff s (the sucker's payoff). For mutual
cooperation [defection] each player receives the same payoff r (reward for mutual
cooperation) [p (punishment)]. The game is symmetric in the sense that a player's
income is independent of the player's identity; it depends only on the two choices.
The payoff values satisfy the inequalities t > r > p > s and 2r > t + s. These
assumptions provide the largest total payoff for mutual cooperators. Compared
with this situation, a defector gains extra income against a cooperator,
whose loss exceeds the defector's benefit. Consequently, the choice of defection can
be interpreted as exploiting behavior. These are the main features for which
prisoner's dilemma games are used to study the emergence of mutual cooperation,
altruism, and ethical norms among selfish individuals [5,10].

The rational players should defect as this choice provides the larger income, 
independently of the partner's decision. However, this situation creates a dilemma 
for intelligent players as mutual cooperation would result in higher income for each 
of them than mutual defection does. 

In the iterated round-robin prisoner's dilemma games we can introduce some
simple evolutionary processes without assuming intelligent players (who are capable
of finding the best strategy if it exists). These games are started from an initial
set of strategies, each of which defines a player's decision given the previous
choices. The evolutionary process is designed to model the Darwinian selection
principle among n (selfish) players whose total income comes from n − 1 games
within a given round. In the simplest evolutionary models the worst player adopts
the winner's strategy round by round.

The numerical simulations have demonstrated the crucial role of the so-called "tit
for tat" strategy in the emergence of mutual cooperation. Despite its simplicity,
the "tit for tat" strategy won the computer tournaments conducted by Axelrod [5].
The "tit for tat" strategy cooperates in the first step and then always repeats its
co-player's previous decision. This strategy cooperates forever with all the other
so-called nice strategies, which never defect first. Furthermore, its defection and
cooperation can be interpreted as punishment and forgiveness in reaction to
the previous decision of the other strategy. The most remarkable feature of this
strategy is that "tit for tat" players are capable of sustaining mutual cooperation
among themselves in the presence of defectors.

Early numerical investigations have also indicated the importance of local
interactions, because they favor the formation of cooperating colonies. In the
simplest models the players are distributed on a lattice and the interaction (the
games between two players as well as the strategy adoption) is limited to a given
neighborhood. Evidently, the short-range interactions enhance the role of
fluctuations at the same time. These models can be investigated well by the
sophisticated methods of non-equilibrium statistical physics.

For the numerical investigation of the spatial effects Nowak and May [11] have 
introduced an SEPDG model, which is equivalent to a two-state cellular automaton. 
Each lattice site can be in one of the two states C and D, representing the two simple
strategies "always cooperate" and "always defect", respectively. The income of a
given player (site) comes from games with its neighbors (and also with itself in some
versions of the model). According to the cellular automaton rule the players modify
their strategy simultaneously in discrete time steps. Namely, each player adopts 
the best strategy found in its neighborhood. The step by step visualization of the 
strategy distribution on a two-dimensional lattice exhibits different spatio-temporal 
patterns (homogeneous and coexisting strategies, transitions between these states, 
competing interfacial invasions, etc.) depending on the payoff matrix, which is 
characterized by a single parameter. In these models the randomness is restricted 
to the initial states. In a subsequent work Nowak et al. [12] have extended the 
former models by allowing irrational strategy adoptions with some probability. The 
simulations indicated that the randomness favors the spreading of D strategies. 
These results have initiated systematic numerical investigations of many stochastic 
cellular automata [13-16]. 

The study of spatio-temporal patterns observed in nature, however, requires
a continuous-time description [17,11]. Moreover, it is difficult to analyze the
above-mentioned stochastic cellular automata in the framework of the generalized
mean-field approximation, which is often used in non-equilibrium physics. To reduce
the technical difficulties Szabó and Tőke have suggested a simplified dynamics [18].
The systematic investigations of this model have shown that, when the model
parameters are tuned, the stationary state undergoes two consecutive phase
transitions which belong to the directed percolation (DP) universality class [18,16].
Very recently this SEPDG model has been extended by allowing three strategies for
the players [19]. In the present work this three-strategy model will be compared
with its simplified version. In the model descriptions and discussion, our attention
will be focused on the motivations, the elementary processes and their consequences,
as well as on the universal features relating the SEPDGs to the area of complex
systems.

SPATIAL EVOLUTIONARY MODEL WITH THREE STRATEGIES

In the present spatial evolutionary prisoner's dilemma game the players are
located on the sites x = (i, j) of a square lattice, where i, j = 1, ..., L. To avoid
undesired boundary effects we assume periodic boundary conditions. Each player
follows one of three strategies: D always defects; C cooperates unconditionally;
T adapts to the partner's strategy, choosing defection against D and
cooperation with C and T. In fact the name T refers to the strategy "tit for tat",
which first cooperates and later repeats the partner's previous decision.
Consequently, after the first step the decisions of T and "tit for tat" are equivalent
against C, D, and themselves. The consequences of the different first decisions become
irrelevant if the strategy changes (defined below) are rare compared with the
frequency of games. At the site x the player's strategy is denoted by a
three-component unit vector s(x) whose possible values are

$$ \mathbf{s}(x) = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \quad \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \quad \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}, \qquad (1) $$

corresponding to the D, C, and T strategies, respectively. At a given time the state
of the whole system is described by the variables s(x).

For each player the total payoff comes from the games with its four nearest
neighbors. Using the above formalism the total payoff m(x) of the player at site
x is given as

$$ m(x) = \sum_{\delta x} \mathbf{s}^{+}(x) \, \mathbf{M} \, \mathbf{s}(x + \delta x), \qquad (2) $$

where s^+(x) is the transpose of s(x) and the summation runs over the four
nearest neighbors (δx). Accepting the simplified payoff matrix suggested by Nowak
and May [11], M is given by the following expression:

$$ \mathbf{M} = \begin{pmatrix} 0 & b & 0 \\ 0 & 1 & 1 \\ 0 & 1 & 1 \end{pmatrix}, \qquad (3) $$

where the only free parameter b (1 < b < 2) measures the temptation to defect. In
the above-mentioned notation the present payoff matrix corresponds to the choices
r = 1, p = 0, t = b, and s = −ε in the limit ε → 0.
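
As an illustration of the payoff definition above, the following minimal Python sketch evaluates the total payoff of a single site on a periodic lattice. The integer encoding (0 = D, 1 = C, 2 = T), the function names `payoff_matrix` and `total_payoff`, and the nested-list lattice are assumptions made for this example; they are not part of the original model description.

```python
D, C, T = 0, 1, 2  # assumed integer encoding of the three strategies

def payoff_matrix(b):
    """M[i][j]: payoff of strategy i against strategy j, in the order D, C, T."""
    return [[0.0, b,   0.0],
            [0.0, 1.0, 1.0],
            [0.0, 1.0, 1.0]]

def total_payoff(lattice, i, j, b):
    """Total payoff of site (i, j) from its four nearest neighbors (periodic lattice)."""
    L = len(lattice)
    M = payoff_matrix(b)
    own = lattice[i][j]
    neighbors = [((i + 1) % L, j), ((i - 1) % L, j),
                 (i, (j + 1) % L), (i, (j - 1) % L)]
    return sum(M[own][lattice[k][l]] for k, l in neighbors)

# Example: a 4 x 4 lattice of T players with a single defector at site (1, 1)
lattice = [[T] * 4 for _ in range(4)]
lattice[1][1] = D
print(total_payoff(lattice, 1, 1, b=1.6))  # the isolated defector: 0.0
print(total_payoff(lattice, 0, 1, b=1.6))  # a T player next to the defector: 3.0
print(total_payoff(lattice, 3, 3, b=1.6))  # a T player far from the defector: 4.0
```

In this example the isolated defector earns nothing from its T neighbors, while a player surrounded only by cooperating strategies earns the maximum payoff of 4.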

To model the Darwinian selection rule the players are allowed to modify their
strategy. In the simplest case the system evolution is governed by random sequential
updates. This means that a randomly chosen player (e.g., at site x) adopts the
strategy s(x + δx) of one of its neighbors if m(x + δx) > m(x), and this elementary
process is iterated many times.

Here it is worth mentioning that a state consisting only of C and T strategies 
leads to a uniform payoff distribution [m(x) = 4] and the above dynamics leaves 
this state unchanged. An example of a more complicated situation is given in 
Figure 1. The payoffs associated with the three different strategies, D, C and T, 
are explicitly given. 

The reader can easily check that inside a D region the defectors receive zero
payoffs. The same is true for a solitary T surrounded by defectors. In the absence
of C strategies, however, two (or more) neighboring T strategies will invade the D
territories because their mutual cooperation gives them some income, while the
defectors' payoff remains zero.

FIGURE 1. The payoff distribution is indicated by numbers on the square lattice for a given
configuration of D (black box), C (empty), and T (open box) strategies.

In the presence of C strategies, however, the above situation becomes quite
different, as the exploitation provides large incomes for the defectors. As a result,
the defectors can invade the neighboring C or T sites for some configurations. This
process dominates the time evolution for small T and large C concentrations, as
illustrated in a ternary diagram (see Figure 2). Note that the trajectories are
two-dimensional projections from a high-dimensional space. Accordingly, trajectories
can cross. As the average defector's payoff decreases with the C
concentration, sooner or later the T → D invasion processes will govern the system
evolution and, finally, all the D strategies become extinct. Figure 2 shows clearly
that the ratio of C and T strategies in the final (frozen) state depends on the
initial state.

It is emphasized that in the absence of T strategies the defectors will dominate
the present system in the final state. This is not evident, as in Figure 1 one can
find many D-C pairs where C beats D. In general, these pairs are located along
the horizontal and vertical straight fronts separating the D and C domains. The
random sequential invasions, however, make the smooth fronts irregular, and this
situation generally favors the D → C invasion over the opposite one. As a result,
the "sharp" D fronts cut the C domains into small pieces and finally all the Cs
will be eliminated.

FIGURE 2. Monte Carlo results for the time evolution in the absence of mutation when the system
is started from different (uncorrelated) initial states indicated by X symbols.

The reader can easily recognize that in most of the C-D (or T-D) confrontations 
the direction of dominance is not affected by the value of b within the prescribed 
region (1 < b < 2). A systematic analysis shows that there is only one situation
in which the value of b becomes important, namely, when a defector has a payoff of 2b
while its C (or T) counterpart has 3. In this case D wins if b > 3/2; otherwise
D will be invaded. These types of elementary processes, however, do not modify
the system behavior drastically [19]; therefore the subsequent investigations will be
focused on the case b > 3/2.

The above dynamical rules introduce some noise (irrational choices) into the system
evolution. Now an additional (superimposed) noise term is introduced by
allowing the appearance of mutants with probability P. In fact, the effect of two
different external constraints (mutation mechanisms) will be studied in models A
and B as a function of P.

Model A 

In the first model, the above evolutionary rule is modified as follows. Each
randomly chosen player adopts with probability P a randomly chosen strategy
among the two other strategies. With probability 1 − P it follows the old rule.

This model can describe the behavior of those biological and economic systems
where the appearance of mutants cannot be neglected [4,2]. The main feature of
this model is that the mutation mechanism does not allow the extinction of any
strategy.



Model B 

In the second model the mutation mechanism is restricted to the adoption of the C
strategy [19]. In other words, the randomly chosen player adopts the C
strategy with probability P; otherwise it adopts the strategy of one of its neighbors
if this neighbor has a higher income. Note that in this case the extinction of the D
and/or T strategies is permitted.

Model B is intended to describe the effect of an external constraint which enforces
cooperative behavior naively by supporting unconditional cooperation. Such
a phenomenon can be observed in human societies in which some kind of social
pressure forces the D and T players to choose the C strategy. Furthermore, a T
player surrounded only by cooperating strategies (C or T) is also motivated to adopt
the C strategy because of its convenience. In fact, playing C is simpler than
playing T, which requires knowledge of the neighbors' previous decisions.
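
The elementary update steps of models A and B, as described above, can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the integer encoding of the strategies, the nested-list lattice, and the function names `payoff`, `elementary_step`, and `monte_carlo_step` are hypothetical and do not come from the original simulations.

```python
import random

D, C, T = 0, 1, 2  # assumed integer encoding of the strategies

def payoff(lattice, i, j, b):
    """Total payoff of site (i, j) against its four nearest neighbors."""
    L = len(lattice)
    M = [[0.0, b, 0.0], [0.0, 1.0, 1.0], [0.0, 1.0, 1.0]]  # rows/columns: D, C, T
    nbrs = [((i + 1) % L, j), ((i - 1) % L, j), (i, (j + 1) % L), (i, (j - 1) % L)]
    return sum(M[lattice[i][j]][lattice[k][l]] for k, l in nbrs)

def elementary_step(lattice, b, P, model, rng=random):
    """One random sequential update with mutation probability P (model 'A' or 'B')."""
    L = len(lattice)
    i, j = rng.randrange(L), rng.randrange(L)
    if rng.random() < P:
        if model == "A":
            # model A: mutate to one of the two other strategies, chosen at random
            lattice[i][j] = rng.choice([s for s in (D, C, T) if s != lattice[i][j]])
        else:
            # model B: the external constraint always inserts a cooperator
            lattice[i][j] = C
        return
    # Darwinian selection: compare with one randomly chosen nearest neighbor
    di, dj = rng.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
    k, l = (i + di) % L, (j + dj) % L
    if payoff(lattice, k, l, b) > payoff(lattice, i, j, b):
        lattice[i][j] = lattice[k][l]

def monte_carlo_step(lattice, b, P, model, rng=random):
    """One Monte Carlo step per site, i.e. L * L elementary steps."""
    for _ in range(len(lattice) ** 2):
        elementary_step(lattice, b, P, model, rng)
```

Recomputing both payoffs at every step keeps the sketch short; a production code for lattices with L of the order of 1000 would presumably cache or update the payoffs incrementally.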



MEAN-FIELD APPROXIMATION 

In the classical mean-field approximation the system is described by the strategy
concentrations, which satisfy the normalization condition c_D(t) + c_C(t) + c_T(t) = 1.
In this approach the average payoffs are given as

$$ m_D = b\,c_C , \qquad m_C = c_C + c_T , \qquad m_T = c_C + c_T . \qquad (4) $$

For model A, the time-dependent concentrations satisfy the following equations of
motion:

$$ \begin{aligned}
\dot c_D &= \frac{P}{2}\,(c_C + c_T - 2c_D) \mp (1-P)\,c_D\,(c_C + c_T) , \\
\dot c_C &= \frac{P}{2}\,(c_T + c_D - 2c_C) \pm (1-P)\,c_D\,c_C , \qquad (5) \\
\dot c_T &= \frac{P}{2}\,(c_D + c_C - 2c_T) \pm (1-P)\,c_D\,c_T ,
\end{aligned} $$

where the upper (lower) signs are valid if m_D < m_C = m_T (m_D > m_C = m_T).
In these expressions the first terms describe the effect of the external constraint,
and the second terms come from the Darwinian selection mechanism.

After some algebraic manipulations one can easily get the following stationary
solution (for P < 1):

$$ c_D = \frac{1 + P/2 - \sqrt{1 - P + 9P^2/4}}{2(1-P)} , \qquad c_C = c_T = \frac{1 - c_D}{2} . \qquad (6) $$

Here all three strategies are present for arbitrary values of P. Notice that
the concentrations of the C and T strategies are the same due to the symmetries of
Eqs. (5). In the limit P → 0, however, the concentration of the D strategy vanishes.
Evidently, the concentrations of the three strategies become equal when the evolution
is governed exclusively by mutation (P = 1).
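
The stationary solution of model A can be checked numerically by integrating the mean-field equations of motion and comparing the result with the closed-form expression for the defector concentration. The sketch below is a minimal illustration under stated assumptions: the explicit Euler scheme, the step size, the initial state, and the function names are choices made for this example, and the marginal case m_D = m_C is arbitrarily assigned to the lower sign.

```python
import math

def rhs_model_a(c, P, b):
    """Right-hand side of the model A mean-field equations; c = (c_D, c_C, c_T)."""
    cD, cC, cT = c
    mD, mC = b * cC, cC + cT          # average payoffs; m_T equals m_C
    sign = 1.0 if mD < mC else -1.0   # +1: upper signs (D is invaded), -1: D invades
    sel = (1.0 - P) * cD
    return (
        0.5 * P * (cC + cT - 2.0 * cD) - sign * sel * (cC + cT),
        0.5 * P * (cT + cD - 2.0 * cC) + sign * sel * cC,
        0.5 * P * (cD + cC - 2.0 * cT) + sign * sel * cT,
    )

def integrate(c, P, b, dt=0.05, steps=20_000):
    """Explicit Euler integration (an assumed, deliberately simple scheme)."""
    for _ in range(steps):
        d = rhs_model_a(c, P, b)
        c = tuple(ci + dt * di for ci, di in zip(c, d))
    return c

def stationary_cD(P):
    """Closed-form stationary defector concentration, valid for P < 1."""
    return (1.0 + 0.5 * P - math.sqrt(1.0 - P + 2.25 * P * P)) / (2.0 * (1.0 - P))

P, b = 0.1, 1.6
cD, cC, cT = integrate((0.2, 0.3, 0.5), P, b)
print(cD, stationary_cD(P))  # the two values should agree closely
print(cC, cT)                # and c_C should approach c_T
```

The initial state is chosen on the side of the ternary diagram where D is dominated, so the trajectory relaxes to the fixed point without changing the sign convention along the way.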

For model B the corresponding equations of motion are similar to those given by
Eqs. (5); the differences appear in the first terms, proportional to P. Namely,

$$ \begin{aligned}
\dot c_D &= -P\,c_D \mp (1-P)\,c_D\,(c_C + c_T) , \\
\dot c_C &= +P\,(c_T + c_D) \pm (1-P)\,c_D\,c_C , \qquad (7) \\
\dot c_T &= -P\,c_T \pm (1-P)\,c_D\,c_T ,
\end{aligned} $$

where the average payoff values are given by Eqs. (4) and the conditions of validity
of the upper and lower signs are defined as above. The analytical solution of these
equations predicts strikingly different behavior in the stationary state [19], that is,
for 0 < P < 1/2

$$ c_D = \frac{1-2P}{1-P} , \qquad c_C = \frac{P}{1-P} , \qquad c_T = 0 , \qquad (8) $$

while the system goes to the absorbing state (c_C = 1 and c_D = c_T = 0) for P > 1/2.
The most surprising result is the extinction of the T strategy if P > 0.
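
For orientation, the stationary solution of model B can be recovered in a few lines by setting the time derivatives in Eqs. (7) to zero in the region where D dominates (lower signs). The steps below are a reconstruction based on the equations of motion as stated here, not a quotation from Ref. [19].

```latex
% Lower signs of Eqs. (7), valid for m_D > m_C = m_T
\begin{align*}
\dot c_T &= -\left[P + (1-P)\,c_D\right] c_T < 0
  \quad\Rightarrow\quad c_T \to 0 , \\
\dot c_D &= c_D\left[(1-P)(c_C + c_T) - P\right] = 0
  \quad\Rightarrow\quad c_C = \frac{P}{1-P} \quad (c_T = 0,\; c_D > 0) , \\
c_D &= 1 - c_C = \frac{1-2P}{1-P} .
\end{align*}
% Consistency checks: with c_T = 0 one has m_D = b c_C > c_C = m_C for b > 1,
% so the lower signs indeed apply, and c_D > 0 requires P < 1/2.
```

The condition c_D > 0 reproduces the threshold P = 1/2 above which only the absorbing C state remains.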

We have to emphasize the non-analytical behavior in the limit P → 0. As
illustrated in the upper plot of Figure 3, without mutation (P = 0) the system
evolves toward either a homogeneous D state (c_D = 1) or a mixed state composed
of C and T strategies with a ratio depending on the initial conditions. However,
the homogeneous D state is unstable against T invasions; therefore in the close
vicinity of this state small perturbations can drive the system toward the
C+T state. Conversely, this mixed state becomes unstable at a given
concentration (where m_D = m_C = m_T) against small perturbations increasing c_C
and c_D simultaneously. In other words, the system evolves toward D dominance as a
result of fluctuations when the state is positioned on the right-hand side of the
dashed line (see the upper plot in Figure 3). This feature explains why the system is
so sensitive to applied external constraints.

Figure 3 illustrates that for model A the mutation drives the (concentration)
trajectories away from the boundaries. On the right-hand side of the dashed line
(m_D > m_C = m_T), c_C and m_D decrease while c_D and c_T increase until the
trajectory crosses the dashed line. On the left-hand side all the initial states tend
toward the only fixed point, given by Eq. (6). For model B, however, there is no fixed
point on the left-hand side. In this region the external insertion of C strategies
increases the value of c_C until m_D becomes larger than m_C = m_T, and then the D
invasion drives the system toward the fixed point defined by Eq. (8). During the D
invasion the external constraint can compensate only for the loss of C strategies.
Consequently, the T strategies die out exponentially fast.

FIGURE 3. Trajectories describing the time evolution in ternary diagrams for, respectively, the
model with no mutation, model A (P = 0.1), and model B (P = 0.1). The dashed lines separate
the regions where D dominates or is dominated. The fixed points are denoted by bullets.

Notice that the variation of b leaves the fixed points unchanged and modifies
only the slope of the dashed line separating the two regions mentioned above.

Within the framework of mean-field theory, the extinction of the T strategy is a
consequence of the fact that here m_T = m_C [see Eq. (4)], in contrast to the spatially
extended case illustrated in Figure 1.

MONTE CARLO SIMULATIONS 

Systematic Monte Carlo simulations have been performed on a square lattice
consisting of L × L sites with periodic boundary conditions, with L varying from 200
to 1500. The larger sizes were used in the vicinity of the critical points. Each run
started from a random initial state. During the simulations we have monitored the
number of players playing a given strategy (N_α; α = D, C, or T) and the payoffs
related to a given strategy. After some relaxation time we have determined the
average concentrations

$$ c_\alpha = \langle N_\alpha \rangle / L^2 \qquad (9) $$

and the fluctuations

$$ \chi_\alpha = L^2 \left\langle \left( N_\alpha / L^2 - c_\alpha \right)^2 \right\rangle \qquad (10) $$

by averaging over a sampling time varying between 10^4 and 10^6 Monte Carlo steps
(MCS) per site. The results obtained for models A and B, respectively, are the
following.
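
The estimators of Eqs. (9) and (10) amount to a mean and a rescaled variance of the sampled strategy counts. The following Python sketch illustrates this; the function name and the sample data are hypothetical, and the samples are assumed to have been recorded after the relaxation time.

```python
def concentration_and_fluctuation(counts, L):
    """Estimate c = <N>/L^2 and chi = L^2 <(N/L^2 - c)^2> from sampled counts N."""
    n_sites = L * L
    densities = [n / n_sites for n in counts]
    c = sum(densities) / len(densities)
    chi = n_sites * sum((d - c) ** 2 for d in densities) / len(densities)
    return c, chi

# Hypothetical samples of N_D, recorded once per MCS on a 10 x 10 lattice
counts_D = [20, 22, 18, 20]
c_D, chi_D = concentration_and_fluctuation(counts_D, L=10)
print(c_D, chi_D)
```

Because the variance of the density is multiplied by L^2, the quantity chi stays finite for uncorrelated fluctuations as the lattice grows and increases only when correlations extend over large distances.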

Results for Model A 

Figure 4 shows a typical strategy distribution for the stationary state at a small
value of P. In contrast to the mean-field prediction [see Eq. (6)], the system is
dominated by the T strategies. The randomly inserted D and C strategies form
small islands. Occasionally the larger C islands are occupied by Ds; however, a
subsequent T invasion eliminates the larger D territories and maintains the T
dominance. At the same time this process prevents the formation of large C islands
inside a T domain.




FIGURE 4. Typical stationary state distribution of defector (closed black square), cooperator 
(white area), and Tit for Tat (open square) strategies in model A, on a 100 x 100 portion of a 
large system for P = 0.02 and b > 3/2. 



One can observe in Figure 5 that when the value of P is increased, the
concentrations of the D and C strategies increase monotonically. In the limit P → 1,
the strategy distribution on the lattice tends toward a random (uncorrelated) one
with c_D = c_C = c_T = 1/3, in agreement with the classical mean-field theory [see Eq.
(6)]. In this case, instead of the neighbor invasions, the system evolution is ruled
by the stochastic mutation mechanism.

As shown in Figure 5, the Monte Carlo data agree remarkably well with the
results of the pair approximation. This pair approximation is considered as a
generalized mean-field theory taking the nearest-neighbor correlations explicitly
into account. The details of this calculation are available in many previous works
[18-20]. The good agreement reflects the absence of long-range correlations, which
is observable in the "homogeneous" strategy distribution (see Figure 4). It is worth
mentioning that the pair approximation is capable of describing the dominance of T
strategies in the limit P → 0.



Results for Model B 



In order to visualize the relevant differences between the two models at small
P values, the strategy distribution for model B is displayed in Figure 6. When
comparing the corresponding snapshots (Figures 4 and 6) the reader can easily
recognize the most striking difference, namely, the appearance of a strongly
correlated spatial structure for model B. In this case the formation of large C domains
inside the sea of T strategies is not prevented by the random appearance of D
mutants, as happened in the previous case. The large C domains (white areas in
Figure 6), however, are unprotected against the D invasion. Figure 6 shows some
D domains (black areas) invading the C territories. These D domains are
"strip-like" because their territories are invaded simultaneously by the T strategies.
This invasion process is similar for larger values of P as well (but 0 < P < P_c1, see
later), and only the average invasion velocity changes. On the other hand, the randomly
inserted C strategies survive and accumulate in the T domains. Consequently, far
behind the T-D invasion front the T territory will be occupied by the externally
inserted Cs, and then this area becomes unprotected against the D invasion. Sooner
or later this area will be invaded by Ds and the above process repeats itself. This
means that the cyclic invasion maintains a self-organizing domain structure. Here
we have to emphasize that this cyclic (rock-scissors-paper-like) dominance is
provided by the external constraint.

FIGURE 5. Stationary strategy concentrations in model A as a function of P for b > 3/2. The
Monte Carlo results are represented by closed squares (D), open diamonds (C), and open triangles
(T). The predictions of the pair approximation are indicated by solid, dashed, and long-dashed
lines, respectively.

Similar processes are observed in the forest-fire models [21,22] introduced by Bak
et al. [23] to model the phenomenon of self-organized criticality. In these models
each cell can be in one of the following three states: non-burning tree, burning
tree, and ash. The dynamics is governed by cyclic dominance, similarly to our
model B. Note that the consequences of cyclic invasion with three (or more) states
have been studied in Lotka-Volterra models [24-26,20] and in cyclically dominated
voter models [27-31].

FIGURE 6. Typical snapshot of the stationary state of model B at P = 0.02 and for b > 3/2.
The symbols are the same as in Figure 4.

In model B the transition from the T to the C state introduces a characteristic
length and time unit, both proportional to 1/P. In other words, this length unit
characterizes the typical (linear) size of the T+C domains, and the time unit
corresponds to the period of the cyclic invasion processes at a given site.

When the value of P is increased, the typical size of the T+C domains decreases and
the concentration of Ds increases. It is found that the T strategies die out if
P > P_c1 = 0.1329(1). Figure 7 shows a typical snapshot in the vicinity of this
critical value. In this case the external support is sufficiently strong to maintain
small C clusters inside the D domains. The most remarkable feature of this snapshot is
that the Ts form non-uniformly distributed small (isolated) colonies. Observation
of the time evolution of the configurations shows that these T colonies walk randomly:
they can die out spontaneously, a single colony can split into two, or two colonies
can merge. This phenomenon is analogous to branching annihilating random
walks (BARW), which exhibit a critical transition when the control parameters
are varied [32,33]. The corresponding critical transitions, both for our model B and
for BARW, belong to the so-called directed percolation universality class [34,35].

FIGURE 7. Stationary strategy distribution in model B for P = 0.13 and b > 3/2. The symbols
are the same as in Figure 4.

For P > P_c1, the concentration of D decreases monotonically as P is increased
and vanishes at P = P_c2 = 0.3678(1). This extinction process is similar to the
previous one, i.e., it also belongs to the DP universality class. The similarity in the
correlations is recognizable in the spatial distribution of the vanishing strategies
when comparing the snapshots displayed in Figures 7 and 8.

For P > P_c2, any initial state evolves toward the absorbing state where all the
players follow the C strategy.

The results of our systematic investigations are summarized in Figure 9. Systematic
numerical investigations in the close vicinity of the critical points show that
the vanishing concentrations follow the same power-law behavior. Namely,

$$ c_T \propto (P_{c1} - P)^{\beta} , \qquad c_D \propto (P_{c2} - P)^{\beta} , \qquad (11) $$

in the limits P_c1 − P → 0 and P_c2 − P → 0, respectively, and β = 0.57(3) in both
cases [19]. Within the statistical error this value of the exponent β agrees with that
of the (2+1)-dimensional directed percolation [36,37].
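
One common way to extract such an exponent is a least-squares fit on a log-log scale. The Python sketch below illustrates the idea on synthetic data only; the fitting procedure actually used for the values quoted above is not specified in the text, the function name `fit_power_law` is hypothetical, and the critical point P_c is assumed to be known in advance.

```python
import math

def fit_power_law(P_values, c_values, P_c):
    """Least-squares fit of log c = log A + beta * log(P_c - P); returns (beta, A)."""
    xs = [math.log(P_c - P) for P in P_values]
    ys = [math.log(c) for c in c_values]
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    beta = sxy / sxx
    A = math.exp(y_mean - beta * x_mean)
    return beta, A

# Synthetic data generated from an exact power law, used only to exercise the fit
P_c, beta_true, A_true = 0.1329, 0.57, 2.0
P_values = [0.10, 0.11, 0.12, 0.125, 0.13]
c_values = [A_true * (P_c - P) ** beta_true for P in P_values]
beta_fit, A_fit = fit_power_law(P_values, c_values, P_c)
print(beta_fit, A_fit)  # recovers the exponent and amplitude used above
```

With real simulation data the result depends on the chosen fitting window and on the estimate of P_c, which is why the exponent is quoted with an error bar.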

As expected, these critical transitions are accompanied by the divergence of the
concentration fluctuations, i.e.,

$$ \chi_T \propto (P_{c1} - P)^{-\gamma} , \qquad \chi_D \propto (P_{c2} - P)^{-\gamma} , \qquad (12) $$

in the vicinity of the corresponding critical points. The numerical fitting yields
γ = 0.37(9), in agreement with the DP values [36,37,34,35].

Despite the same universal behavior there is a remarkable difference between the
two extinction processes. The second extinction process (at P = P_c2) results in a
frozen (time-independent) absorbing state. Conversely, the transition at P = P_c1
is an example where the extinction of T strategies happens on a fluctuating
background. In other words, the properties of the absorbing state (frozen or
fluctuating) do not affect the critical behavior of our model.

FIGURE 9. Simulation and pair-approximation results for the stationary concentration of
strategies in model B, versus P. The notation agrees with that of Fig. 5, and the arrows
indicate the critical points where the T and D strategies become extinct.

As demonstrated in Figure 9, the results of the Monte Carlo simulations are
reproduced qualitatively well by the pair approximation [19]. The striking differences
are related to the long-range correlations accompanying the critical transitions at
P = P_c1 and P_c2. Due to the strongly correlated domain structure, illustrated in
Figure 6, the largest deviation can be observed for small P values. We note that
the concentration fluctuations, defined by Eq. (10), also diverge in the limit P → 0.
Unfortunately, in this particular case, we could not deduce a reliable value for the
exponent γ because of the significant size effects. Further systematic analyses are
required to clarify what happens in this limit.

CONCLUSIONS 

We have studied quantitatively the effect of external constraints on the
emergence of cooperation in an evolutionary prisoner's dilemma game with three
possible strategies (cooperation, defection, and tit for tat). In the present spatial
model the players are distributed on a square lattice and their interactions are
restricted to nearest neighbors. The Darwinian selection rule is modeled by the
adoption of the neighboring successful strategies. Two types of mutation mechanisms
(external constraints), whose strength is characterized by a control parameter P, are
superimposed on this evolutionary process.

The choice of these three possible strategies yields non-analytical behavior in the
limit P → 0 for both the mean-field approximation and the Monte Carlo simulations.
The time-dependent predictions of mean-field theory are sensitive to small
perturbations.

According to the Monte Carlo simulations, in the absence of external constraints
the system tends toward a frozen state composed of C and T strategies whose
ratio depends on the initial concentrations. For both types of external constraints
(models A and B) the system evolves toward a stationary state independently of
the initial condition, and the defector concentration vanishes linearly as P → 0. In
the limit P → 0, however, models A and B exhibit different ratios of C and T
strategies. This difference is related to the appearance of self-organizing patterns
in model B. The present investigation indicates that such a society of strategies
(or species) is very sensitive to the type of external support (or the ratio of
mutation rates).

The measure of mutual cooperation can be well characterized by the average
payoff, whose maximum (4) can be reached only in the absence of defectors. Figure 10
compares the Monte Carlo results for models A and B. Surprisingly, for weak
external support (small P) the average payoff is larger for model A than for model
B. Contrary to the naive expectation, the weak support of defenseless
cooperators has the opposite effect. Namely, this mechanism feeds the defectors
and simultaneously prevents their elimination by the retaliatory (T) strategies.

Examples from the political and economic world support the above
conclusions. In general, the exploiters benefit from governmental support for the
defenseless layer of a society. The most dangerous effect is the reduction of the
T-type population, which can maintain mutual cooperation against the exploiters.
From the viewpoint of cooperation, it is better to help those individuals who are
able to protect themselves against exploitation.



FIGURE 10. Comparison of average payoffs as a function of P for models A (closed circles)
and B (plus symbols). The sharp minimum coincides with the extinction of T strategies in
model B.



Evidently, for sufficiently large P values the random insertion of C strategies can
provide their dominance. In this case the A-type external support is preferred to
the B-type one if we wish to improve cooperation. Above a threshold value the B-type
external constraint yields a homogeneous C state, which is defenseless against
any defector appearing occasionally in a real system. Further systematic research
is required to clarify what happens in those models where the mutation mechanism
is characterized by three independent control parameters.

The present study confirms that the T strategy is able to prevent the spreading
of defection in the spatial models. We have to emphasize, however, that according
to the simplest mean-field theory, T dies out if the external support is of B type.
Consequently, the defectors will dominate those systems where the mean-field
theory is exact (e.g., infinite range of interaction, or randomly chosen partnership).
In these mean-field-like systems, the games between the "parent" and "its offspring"
are not emphasized (they are not neighbors), which is an advantage for the defectors
compared with spatially extended models. In the light of this feature our
investigations raise many interesting questions related to the transition from the
"short-range" spatially extended systems to the "long-range" mean-field-like ones.